1. INTRODUCTION

Facial feature extraction is the localization of distinct facial characteristics. The resulting points carry the
critical information required to classify an individual, and they are obtained by fitting a model [1]. The
model is composed of a number of landmark points that is determined by the complexity of the
object's shape and the level of detail required to locate it. In the literature, facial landmarks are also referred to as
facial feature points, anchor points, homologous points, and essential points. Numerous
techniques treat the face image and the collection of facial feature points as a single shape [2].
These methods exploit prior knowledge about the position of the face (gleaned from labeled
training images) and constrain the landmark search with heuristic criteria such as areas, angles, and
distances. As a result, they can be used to estimate the shape of a previously unseen face [3].
These approaches include the active shape model (ASM) [4], the active appearance model (AAM) [5], and
the constrained local model (CLM) [6]. The AAM and ASM were compared in [7], where
ASM was found to outperform AAM in speed and in success at predicting feature point locations.

The AAM models the appearance of the entire facial region, whereas the ASM models only
the image texture nearest to each landmark point. ASM is faster and has a larger search
domain than AAM, while AAM identifies texture better [7]. AAMs can locate the head properly
and hence give accurate pose estimates, which is their main advantage. However, the accuracy
of AAMs depends mainly on the training set: for an AAM to work on different faces, the training set
must contain many faces in different poses, and this is the main limitation of AAM. The CLM
combines local facial appearance models with a global facial shape pattern.

Regression-based strategies use holistic or local appearance information, and
they may embed the global facial shape patterns implicitly for joint landmark detection. The
CLM has seen remarkable developments in the last few years. Zadeh et al. [8] introduced
the deep constrained local model (DCLM) method and, as its local detector, the novel dense projection network
(DPN). The DPN is a deep neural network consisting of two key layers: a template projection layer and a
dense aggregation layer. In the template projection layer, patches of the facial region are mapped to a
higher-dimensional space in which pose and rotation variations can be captured accurately.

In the dense aggregation layer, an ensemble of experts is simulated within a single network to make
landmark localization more robust. Zahraddeen et al. [9] developed a feature
extraction strategy that combines the Gabor filter and the discrete cosine transform (DCT). The Gabor filter
converted the images into Gabor magnitude images, which were then smoothed with an image smoother before
the transform was applied. The effect of the Gabor filter on the DCT coefficients was evaluated using
the correlation coefficient. Geng et al. [10] developed deep learning algorithms for extracting facial features based
on convolutional neural networks (CNN). In contrast to previous CNN-based models, these
used feature maps derived from separate layers. Numerous other strategies have recently been applied to extract
facial points or features from photos with the aim of improving feature extraction [11]-[13].

In [14], a reconstruction-based technique was developed for the face alignment problem. Instead of
a direct regression between the features and the shape space, the idea of shape increment
reconstruction is introduced. Moreover, a set of coupled overcomplete dictionaries, termed the shape
increment and local appearance dictionaries, is learned in a regressive manner on selected deep features. A
constrained local model is a class of methods for locating a set of points on a target image. Much
research has been devoted to characterizing this model; for instance, see [15]-[20]. This article
proposes an improved CLM that combines the strengths of both depth and intensity data to detect and
track facial features in images. The depth data helps to minimize the impact of lighting conditions.
Moreover, it reduces the influence of the aperture problem, which arises because the
patch response is strong along edges but not across them. On the other hand, our method relies on the depth response
only when there is no strong intensity signal or when lighting conditions are inadequate.

Much face recognition research has been conducted with the goal of gender
prediction; for example, see [21]-[26]. This paper employs the A-CLM model for gender prediction, using
four classifiers: support vector machine (SVM), support vector regression (SVR),
k-nearest neighbors (KNN), and progressive transductive SVM (PT-SVM). The most significant
advance in this study is the adoption of a constrained local model with three patch experts, which is a substantial departure from the
previous constrained local model with a single patch expert.


2. RESEARCH METHOD 

This section explains the methodology and procedures employed in this
study. Section 2.1 describes the point distribution model (PDM), section 2.2 discusses patch
experts, section 2.3 summarizes the fitting procedure, and section 2.4 covers the database that
was used.

In this article, feature extraction is carried out in three stages, as seen in Figure 1.
Because our approach is based on the CLM structure, that structure is detailed here. As illustrated in Figure 1, CLM has
three essential components: a PDM, patch experts, and a fitting technique. The PDM describes the locations of the facial
feature points in the image using non-rigid shape parameters and rigid global transformation parameters. The patch
experts model the appearance of the local patches surrounding the landmarks.
The following subsections describe each component in greater detail.


[Diagram: Point distribution model (PDM), Patch expert, Fitting method]


Figure 1. The enhanced CLM basic components 


2.1. Point distribution model 

Our modified CLM model is controlled by the parameters $p = [s, R, q, t]$, which can be varied to
obtain a specific instance of the model. Here, $s$ is the scaling factor, $R$ is the object rotation, $t$
is the 2D translation, and $q$ is a vector describing non-rigid shape variation. The PDM used
in A-CLM places the $i$-th feature point at:

$$x_i = s\,R\,(\bar{x}_i + \Phi_i q) + t \qquad (1)$$

Here, $\bar{x}_i = [\bar{x}_i, \bar{y}_i, \bar{z}_i]^T$ denotes the mean value of the $i$-th feature, $\Phi_i$ is
a $3 \times m$ principal component matrix, and $q$ is the $m$-dimensional vector of parameters
controlling the non-rigid shape. Rather than using perspective projection, this method uses a
weak-perspective (scaled orthographic) camera model, as its linearity allows for more efficient optimization. In
a weak-perspective model, the scaling factor $s$ can be interpreted as the inverse of the average depth and the translation
vector $t$ as the center point. This is a good approximation because the variations in depth along the face plane
are small relative to the distance to the camera. In the proposed model, we determine the maximum a posteriori
(MAP) estimate of the face model parameters $p$:


$$p(p \mid \{l_i = 1\}_{i=1}^{n}, I) \propto p(p) \prod_{i=1}^{n} p(l_i = 1 \mid x_i, I) \qquad (2)$$

Here, $l_i \in \{-1, 1\}$ indicates whether the $i$-th feature point is aligned or misaligned, $p(p)$
is the prior probability of the model parameters $p$, and $\prod_{i=1}^{n} p(l_i = 1 \mid x_i, I)$ is the joint
probability of the feature points being aligned at their locations $x_i$, given an intensity
image $I$. Patch experts are normally used to calculate $p(l_i = 1 \mid x_i, I)$, the probability that a
feature is aligned at $x_i$.
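
The weak-perspective PDM described in this subsection can be illustrated with a minimal Python sketch. It assumes the usual CLM form, in which a landmark is its mean 3D position plus a linear non-rigid deformation, followed by rotation, scaling, and 2D translation. The function names, the restriction to a yaw-only rotation, and the numeric values are illustrative assumptions, not part of the proposed implementation.

```python
import math


def pdm_landmark(mean_pt, basis, q, s, R, t):
    """Place one landmark as s * R * (mean + basis * q) + t.

    mean_pt : mean 3D position of the landmark, [x, y, z]
    basis   : 3 x m principal component matrix (list of 3 rows)
    q       : m non-rigid shape parameters
    s       : scale factor of the weak-perspective camera
    R       : 2 x 3 matrix, the first two rows of a 3D rotation
    t       : 2D translation [tx, ty]
    """
    # Non-rigid deformation in 3D: mean + basis * q
    shape = [mean_pt[r] + sum(basis[r][k] * q[k] for k in range(len(q)))
             for r in range(3)]
    # Rigid part: rotate, drop depth (orthographic), scale, translate
    return [s * sum(R[r][c] * shape[c] for c in range(3)) + t[r]
            for r in range(2)]


def yaw_rotation_2x3(angle):
    """First two rows of a rotation by `angle` radians about the vertical axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0.0, s],
            [0.0, 1.0, 0.0]]


# Hypothetical landmark with a single deformation mode (m = 1)
mean_pt = [10.0, 5.0, 2.0]
basis = [[1.0], [0.0], [0.0]]
frontal = pdm_landmark(mean_pt, basis, q=[2.0], s=0.5,
                       R=yaw_rotation_2x3(0.0), t=[100.0, 50.0])
print(frontal)  # [106.0, 52.5]
```

Because the projection is linear in $q$ and $t$, the landmark position changes linearly with those parameters, which is the property that makes the weak-perspective model convenient for optimization.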


2.2. Patch experts 

Local patch experts determine how well each feature point is aligned, based on its local support region, by
computing the alignment probability $p(l_i = 1 \mid x_i, I)$. We use (3) as a
probabilistic patch expert: the mean of the two logistic regressors (4) and (5).

$$p(l_i = 1 \mid x_i, I, Z) = 0.5 \times \left( p(l_i = 1 \mid x_i, I) + p(l_i = 1 \mid x_i, Z) \right) \qquad (3)$$

$$p(l_i = 1 \mid x_i, I) = \frac{1}{1 + e^{d\, C_{I,i}(x_i; I) + c}} \qquad (4)$$

$$p(l_i = 1 \mid x_i, Z) = \frac{1}{1 + e^{d\, C_{Z,i}(x_i; Z) + c}} \qquad (5)$$

Here $C_{I,i}$ and $C_{Z,i}$ are the outputs of the intensity and depth patch classifiers, respectively, for the
$i$-th feature, $Z$ is the depth image, $c$ is the logistic regressor intercept, and $d$ is the regression coefficient.
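
As a minimal sketch of the probabilistic patch expert, the following Python code averages two logistic regressors, one fed by the intensity classifier output and one by the depth classifier output. It assumes the exponent has the form `d * response + c` (so `d` is typically negative) and that both modalities share the same `d` and `c`; the function names and values are hypothetical.

```python
import math


def logistic_alignment(response, d, c):
    """Alignment probability 1 / (1 + exp(d * response + c)) for a raw classifier output."""
    return 1.0 / (1.0 + math.exp(d * response + c))


def combined_patch_expert(c_intensity, c_depth, d, c):
    """Mean of the intensity and depth logistic regressors."""
    p_intensity = logistic_alignment(c_intensity, d, c)
    p_depth = logistic_alignment(c_depth, d, c)
    return 0.5 * (p_intensity + p_depth)


# Assumed coefficients: a negative d maps large classifier outputs to high probability
print(combined_patch_expert(0.0, 0.0, d=-1.0, c=0.0))  # 0.5
print(combined_patch_expert(2.0, -2.0, d=-1.0, c=0.0))
```

Averaging keeps the result a valid probability and lets a confident response in one modality compensate for an uninformative one in the other.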

We use SVR patch experts because of their computational simplicity and their efficient implementation on
images via convolution. The objective of the SVR problem is to find the hyperplane that lies as close as possible to the
majority of the training points [16]. Assume $N$ training points $(x_1, y_1), (x_2, y_2), \dots, (x_N, y_N)$ with
$x_i \in \mathbb{R}^d$ and $y_i \in \mathbb{R}$, $i = 1, \dots, N$. We must then construct a hyperplane
defined by the parameters $w$ and $b$. The weight vector $w$ is chosen to have a small norm
while the sum of the distances from the hyperplane to the training points is minimized. These distances are measured with
Vapnik's $\epsilon$-insensitive loss function:

$$|y_i - (w x_i + b)|_\epsilon = \begin{cases} 0, & |y_i - (w x_i + b)| \le \epsilon \\ |y_i - (w x_i + b)| - \epsilon, & \text{otherwise} \end{cases} \qquad (6)$$
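
The $\epsilon$-insensitive loss can be written in a few lines of Python. This is a minimal sketch for a linear predictor; the helper names and the sample values are illustrative assumptions.

```python
def linear_prediction(w, x, b):
    """Compute w . x + b for plain Python sequences."""
    return sum(wk * xk for wk, xk in zip(w, x)) + b


def epsilon_insensitive_loss(y, prediction, epsilon):
    """Vapnik's loss: zero inside the epsilon tube, linear outside it."""
    return max(0.0, abs(y - prediction) - epsilon)


# Hypothetical one-dimensional example
pred = linear_prediction([2.0], [3.0], 1.0)        # 7.0
print(epsilon_insensitive_loss(7.25, pred, 0.5))   # 0.0, inside the tube
print(epsilon_insensitive_loss(9.0, pred, 0.5))    # 1.5, outside the tube
```

Points whose residual falls inside the tube contribute nothing to the objective, so only the points on or outside the tube influence the fitted hyperplane.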


Three patch expert response maps are considered: (A) face contour, (B) nose ridge, and (C) chin. The
response maps of the logistic regressors that use intensity contain strong responses along edges, making it difficult
to locate the actual position of the feature. Our solution mitigates the aperture problem by combining
response maps from both the intensity and depth images.

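The user chooses the value of $\epsilon$, and the regularization constant $C$ controls the trade-off between
a small norm of $w$ and good regression performance on the training points. The quadratic programming (QP) problem
associated with SVR can be expressed as:

$$\min_{w, b, \xi, \xi^*} \; \frac{1}{2} w^T w + C \left( \sum_{i=1}^{N} \xi_i + \sum_{i=1}^{N} \xi_i^* \right) \qquad (7)$$

such that

$$w^T \phi(x_i) + b - y_i \le \epsilon + \xi_i,$$
$$y_i - w^T \phi(x_i) - b \le \epsilon + \xi_i^*,$$
$$\xi_i, \xi_i^* \ge 0, \quad i = 1, \dots, N,$$

where $\phi$ maps $x_i$ into the feature space and $\xi_i$, $\xi_i^*$ are slack variables.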
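
To make the roles of the regularization constant and the slack variables concrete, the sketch below evaluates the SVR primal objective for a given hyperplane. It assumes a linear kernel (the feature map is the identity) and uses the smallest slack values that satisfy the standard SVR constraints; the function name and data are hypothetical.

```python
def svr_primal_objective(w, b, points, epsilon, C):
    """0.5 * w.w + C * (total slack), using the smallest feasible slacks."""
    total_slack = 0.0
    for x, y in points:
        f = sum(wk * xk for wk, xk in zip(w, x)) + b
        xi = max(0.0, f - y - epsilon)       # prediction above the tube
        xi_star = max(0.0, y - f - epsilon)  # prediction below the tube
        total_slack += xi + xi_star
    return 0.5 * sum(wk * wk for wk in w) + C * total_slack


# Hypothetical training points (x_i, y_i)
points = [([1.0], 1.0), ([2.0], 3.0), ([3.0], 2.0)]
print(svr_primal_objective([1.0], 0.0, points, epsilon=0.5, C=10.0))  # 10.5
```

A larger `C` penalizes points outside the tube more heavily, while a smaller `C` favors a small norm of `w`.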
Figure 2 shows example intensity, depth, and combined response maps, obtained by evaluating the patch
experts at the pixels around an initial estimate. A major issue faced by CLM is the aperture problem,
where detection confidence across an edge is higher than along it; in intensity
response maps this is especially apparent for the nose ridge and the face outline. Adding depth information helps
to solve this problem: the strong edges in the two images do not necessarily coincide, which allows further
disambiguation of points along strong edges.
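
The disambiguation effect can be reproduced on toy data. In the following Python sketch, the intensity response is uniformly high along a vertical edge, so its peak is ambiguous along that edge, while the depth response is high along a horizontal edge; averaging the two leaves a single peak at their intersection. The map size and response values are made-up illustrations, not measurements.

```python
def fuse_response_maps(intensity_map, depth_map):
    """Pixel-wise mean of two equally sized response maps."""
    return [[0.5 * (a + b) for a, b in zip(row_i, row_d)]
            for row_i, row_d in zip(intensity_map, depth_map)]


def peak_locations(response_map):
    """All (row, col) positions that attain the maximum response."""
    best = max(max(row) for row in response_map)
    return [(r, c) for r, row in enumerate(response_map)
            for c, value in enumerate(row) if value == best]


size = 5
# Toy intensity map: a vertical edge at column 2 responds equally along its length
intensity_map = [[0.9 if c == 2 else 0.1 for c in range(size)] for r in range(size)]
# Toy depth map: a horizontal depth edge at row 3
depth_map = [[0.9 if r == 3 else 0.1 for c in range(size)] for r in range(size)]

print(len(peak_locations(intensity_map)))                             # 5
print(peak_locations(fuse_response_maps(intensity_map, depth_map)))   # [(3, 2)]
```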


[Figure panels: patch response maps (CLM using intensity); patch response maps (CLM using depth); patch response maps; × marks the ground truth position]


Figure 2. Response maps of the three patch experts


2.3. Fitting 

We use a standard two-step CLM fitting technique [6], [17], [18]: first, an exhaustive local search is performed
around the current feature point estimates, which yields a response map around each feature point; then
the model parameters are updated iteratively to maximize (2) until a convergence criterion is met. For fitting we
use non-uniform regularized landmark mean shift (NU-RLMS) [2]. Given an initial CLM
parameter estimate $p$, NU-RLMS iteratively finds an update $\Delta p$. We assume that the non-rigid
shape parameters $q$ follow a Gaussian distribution, with the variance of the $i$-th parameter equal to the
eigenvalue of the corresponding non-rigid deformation mode; the rigid parameters $s$, $R$, and $t$ follow a non-informative uniform
distribution. The positions of the true landmarks are treated as unknown variables and marginalized out
of the probability that the landmarks are aligned:


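$$p(l_i = 1 \mid x_i, I, Z) = \sum_{y_i \in \Psi_i} p(l_i = 1 \mid y_i, I, Z)\, p(y_i \mid x_i) \qquad (8)$$

where $p(y_i \mid x_i) = \mathcal{N}(y_i; x_i, \rho \mathbf{I})$, with $\rho$ denoting the variance of the noise in landmark locations that results from
PCA truncation in PDM construction, $\mathbf{I}$ the identity matrix, and $\Psi_i$ denoting all integer locations within the patch area.
Substituting (8) into (2) gives:

$$p(p \mid \{l_i = 1\}_{i=1}^{n}, I, Z) \propto p(p) \prod_{i=1}^{n} \sum_{y_i \in \Psi_i} p(l_i = 1 \mid y_i, I, Z)\, \mathcal{N}(y_i; x_i, \rho \mathbf{I}) \qquad (9)$$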


The MAP term in (9) can be maximized using expectation maximization. Our implementation
differs from the original algorithm in that, in addition to frontal images,
patch experts are also trained on profile face images. This leads to three sets of classifiers (frontal, left, right), with no
response functions for the occluded landmarks in the left and right sets. This allows us to deal with
self-occlusion, since invisible points are not evaluated during the fitting procedure.
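
The marginalization over candidate landmark positions, and the mean-shift step that the expectation maximization update is commonly reduced to in regularized landmark mean shift, can be sketched in Python as follows. The sketch handles a single landmark, ignores the shape prior and the non-uniform weighting, and uses a hypothetical 3 x 3 response patch; it is an illustration under these assumptions, not the authors' implementation.

```python
import math


def gaussian_weight(y, x, rho):
    """Isotropic 2D Gaussian density N(y; x, rho * I), with rho the variance."""
    dist2 = (y[0] - x[0]) ** 2 + (y[1] - x[1]) ** 2
    return math.exp(-0.5 * dist2 / rho) / (2.0 * math.pi * rho)


def marginal_alignment(x, responses, rho):
    """Sum over patch locations y of p(l = 1 | y) * N(y; x, rho * I).

    responses maps integer locations (col, row) to patch expert probabilities.
    """
    return sum(p * gaussian_weight(y, x, rho) for y, p in responses.items())


def mean_shift(x, responses, rho):
    """Mean-shift vector: weighted average of the locations minus the current x."""
    weights = {y: p * gaussian_weight(y, x, rho) for y, p in responses.items()}
    total = sum(weights.values())
    mean_x = sum(w * y[0] for y, w in weights.items()) / total
    mean_y = sum(w * y[1] for y, w in weights.items()) / total
    return (mean_x - x[0], mean_y - x[1])


# Hypothetical 3 x 3 patch with the strongest response one pixel to the right
responses = {(dx, dy): 0.1 for dx in (-1, 0, 1) for dy in (-1, 0, 1)}
responses[(1, 0)] = 0.9
shift = mean_shift((0.0, 0.0), responses, rho=1.0)
print(shift)  # points toward the strong response at (1, 0)
```

In a full fitting loop, the mean-shift vectors of all landmarks would be projected onto the PDM parameters so that the update stays within the space of plausible face shapes.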


2.4. Face database 

In this paper, we used the labeled face parts in the wild (LFPW) database [15], a facial image
database that includes precise gender and age information for each image. The LFPW
database was used to evaluate the efficiency and accuracy of the proposed approach. The test set consists of
700 face images, each annotated with 68 landmarks. The number of search locations has a measurable effect
on the localization result. The performance of the proposed model is compared with that of existing models.
3. DIFFERENCE BETWEEN CLM AND A-CLM

Figure 3 illustrates the difference between the original CLM and A-CLM; the upper diagram is the CLM
and the lower one is the A-CLM. The main difference is that
CLM relies on one patch expert to extract features from the face, whereas A-CLM relies on three
patch experts. The other difference lies in the fitting stage, as shown in Figure 3.

Figure 3. Difference between CLM and A-CLM


4. EXPERIMENTAL RESULTS

4.1. CLM results 

This section details the results of using CLM as the facial feature extractor together with machine
learning algorithms as classifiers to estimate gender. The performance of each classifier was measured
using two metrics: mean absolute error (MAE) and cumulative score (CS).
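
The two metrics can be computed as in the short Python sketch below. It assumes the definition of the cumulative score that is standard in the estimation literature, namely the percentage of test samples whose absolute error does not exceed a given threshold; the numeric labels are hypothetical.

```python
def mean_absolute_error(predicted, actual):
    """Average absolute difference between predictions and ground truth."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def cumulative_score(predicted, actual, threshold):
    """Percentage of samples whose absolute error is at most `threshold`."""
    hits = sum(1 for p, a in zip(predicted, actual) if abs(p - a) <= threshold)
    return 100.0 * hits / len(actual)


# Hypothetical predictions and ground-truth labels
predicted = [1.0, 2.0, 5.0, 4.0]
actual = [1.0, 3.0, 3.0, 4.0]
print(mean_absolute_error(predicted, actual))        # 0.75
print(cumulative_score(predicted, actual, 1.0))      # 75.0
```

Plotting the cumulative score against increasing thresholds gives cumulative score curves of the kind reported in the figures of this section.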

Table 1 shows the MAE of the machine learning classifiers (SVM, SVR, KNN, and PT-SVM) using
CLM as the facial feature extraction model, with SVR and CLNF as patch experts. To examine the performance of
these classifiers for gender estimation, experiments were conducted on the LFPW database with
CLM as the feature extraction method, using CLNF and SVR as patch experts separately.
The cumulative score curves from these experiments are shown in Figures 4 and 5. Table 2
gives the efficiency of CLM with the two patch experts and the four classifiers.


Figure 4. Cumulative score curves of the gender prediction methods using CLM with CLNF 


Figure 5. Cumulative score curves of the gender prediction methods using CLM with SVR 


Table 1. MAE of the gender estimation techniques using CLM
Patch expert    SVM     SVR     KNN     PT-SVM
SVR             4.25    4.67    9.58    4.24
CLNF            4.91    4.97    8.54    4.74

Table 2. Efficiency of the gender prediction algorithms using CLM
Patch expert    SVM         SVR         KNN         PT-SVM
SVR             0.065847    0.079852    0.040545    0.076524
CLNF            0.075214    0.083695    0.095874    0.071585


4.2. A-CLM results 

This section presents the results of the proposed A-CLM model as the facial feature extractor
together with the same machine learning algorithms to estimate gender. The performance of each classifier
was again measured using MAE and CS. Table 3 shows the MAE of all machine
learning algorithms (SVM, SVR, KNN, and PT-SVM) using A-CLM for facial feature extraction. Figure 6
shows the cumulative score of the proposed model as the feature extraction method with each
classifier.

Table 4 gives the average processing time to estimate gender using the proposed model as the
feature extraction method and images from the LFPW database, for each machine
learning algorithm (PT-SVM, SVM, SVR, and KNN). The average prediction time is the time taken
for the classifier to train on and test the images.

Figure 6. Cumulative score curves of the gender prediction methods using A-CLM


Table 3. MAE of the gender prediction algorithms using A-CLM
Method    SVM     SVR     KNN     PT-SVM
A-CLM     4.44    4.38    8.14    3.64

Table 4. Efficiency of the gender prediction algorithms using A-CLM
Method    SVM         SVR         KNN         PT-SVM
A-CLM     0.059845    0.078369    0.055145    0.079852


5. CONCLUSION 

This study discussed the implementation of the proposed model and the findings obtained from it.
The experimental results for estimating gender from faces were reviewed in detail, and the
effectiveness of the proposed model was evaluated with various classifiers. The experimental results
show that the proposed model gives its most accurate gender predictions with PT-SVM
(an MAE of 3.64, Table 3). In addition, with PT-SVM the proposed model is more
accurate than the original CLM (an MAE of 3.64 against 4.24 and 4.74 with the SVR and CLNF patch experts, respectively).